Selectivity Estimation for Exclusive Query Translation in Deep Web Data Integration
Abstract
In Deep Web data integration, some Web database interfaces express exclusive predicates of the form Qe = Pi, where Pi ∈ {P1, P2, ..., Pm}, which permit only one predicate to be selected at a time. Accurately and efficiently estimating the selectivity of each Qe is critical to optimal query translation. In this paper, we focus on selectivity estimation for infinite-value attributes, which is more difficult than for key attributes and categorical attributes. First, we compute the attribute correlation and retrieve approximately random attribute-level samples by submitting queries on the least correlated attribute to the actual Web database. Then we estimate the Zipf equation from the word ranks in the sample and the actual selectivity of several words obtained from the actual Web database. Finally, the selectivity of any word on the infinite-value attribute can be derived from the Zipf equation. An experimental evaluation of the proposed selectivity estimation method is provided, and the experimental results show that the method is highly accurate.
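The Zipf-based step of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the simple Zipf form f(r) = C / r^alpha, takes selectivity to mean the number of matching records, and fits C and alpha by least squares in log-log space. The function names (`rank_words`, `fit_zipf`, `estimate_selectivity`) and the idea of passing probed counts as a dictionary are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def rank_words(sample_values):
    """Rank words by descending frequency in the sample (rank 1 = most frequent).

    Ties are broken alphabetically so the ranking is deterministic.
    """
    counts = Counter(w for value in sample_values for w in value.lower().split())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {word: rank for rank, (word, _) in enumerate(ordered, start=1)}


def fit_zipf(ranks, probe_counts):
    """Fit log f = log C - alpha * log r over the probed words.

    probe_counts maps a word to its actual match count, assumed to have been
    obtained by querying the real Web database (hypothetical input).
    Returns (C, alpha).
    """
    xs = [math.log(ranks[w]) for w in probe_counts]
    ys = [math.log(c) for c in probe_counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need probed words with at least two distinct ranks")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_x), -slope


def estimate_selectivity(word, ranks, c, alpha):
    """Estimate the match count of a sampled word from its rank."""
    return c / ranks[word] ** alpha
```

In this sketch only a few words need to be probed against the real database. Every other word that appears in the sample is estimated from its rank alone, which is the property the abstract relies on.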
Similar resources
A New Approach for Optimization of Dynamic Metric Access Methods Using an Algorithm of Effective Deletion
New Challenges in Petascale Scientific Databases p. 1 Adventures in the Blogosphere p. 2 The Evolution of Vertical Database Architectures A Historical Review p. 3 Query Optimization in Scientific Databases Linked Bernoulli Synopses: Sampling along Foreign Keys p. 6 Query Planning for Searching Inter-dependent Deep-Web Databases p. 24 Summarizing Two-Dimensional Data with Skyline-Based Statistic...
On-the-Fly Constraint Mapping across Web Query Interfaces
Recently, the Web has been rapidly “deepened” with the prevalence of databases online and becomes an important frontier for data integration. On this deep Web, a significant amount of information can only be accessed as response to dynamically issued queries to the query interface of a back-end database, instead of by traversing static URL links. Such a query interface expresses a set of constr...
Progressive Deep Web Crawling Through Keyword Queries For Data Enrichment
Data enrichment is the act of extending a local database with new attributes from external data sources. In this paper, we study a novel problem: how to progressively crawl the deep web (i.e., a hidden database) through a keyword-search interface for data enrichment. This is challenging because these interfaces often enforce a top-k constraint, or they have limits on the number of queries that ca...
Toward Large Scale Integration: Building a MetaQuerier over Databases on the Web
The Web has been rapidly “deepened” by myriad searchable databases online, where data are hidden behind query interfaces. Toward large scale integration over this “deep Web,” we have been building the MetaQuerier system for both exploring (to find) and integrating (to query) databases on the Web. As an interim report, first, this paper proposes our goal of the MetaQuerier for Web-scale integra...
Ontology Based Automatic Attributes Extracting and Queries Translating for Deep Web
Search engines and web crawlers cannot access the Deep Web directly. The workable way to access the hidden database is through query interfaces. Automatically extracting attributes from query interfaces and translating queries is a feasible way of addressing the current limitations in accessing the Deep Web. However, the query interface provides semantic constraints, some attributes are co-occurred a...
Publication date: 2009